ABSTRACT



This report is the result of a research effort that 
tried to find out what determines how much a student learns during 
his 4 years in college. The major purpose was to find partial answers 
to two basic questions. (1) If the input with respect to student 
ability is held constant, will identifiable groups of colleges have 
graduates showing greater gain in achievement than others? (2) 
Contingent on demonstrating differential gains between colleges, what 
are the characteristics of the most and least effective schools? The 
control variables were the verbal and mathematical scores of the SAT 
and the student's major field of study. The output performance
variables were the area tests of the GRE Institutional Testing 
Program. The latter are considered achievement tests of institutional 
effectiveness. Institutional resources were also considered. Most of 
the colleges in the sample were small and included many types of 
liberal arts institutions. Results indicated that 85% to 91% of the 
between college variance was predictable from student input. A small 
but significant proportion was predictable from income per student, 
the proportion of faculty with a doctorate, full time equivalent, and 
the interaction of these 3 variables for all but the GRE-Social 
Science. (AF)

The Identification and Evaluation of College
Effects on Student Achievement

As greater numbers of young people continue on to college, it becomes of
increasing concern to know what determines how much a student learns during
his four years in college. Such information is important not only to the
theorist who is attempting to understand how and to what extent college
characteristics influence student behavior, but also to the college
administrator who requires such information for decisions concerning the
optimal allocation of limited funds among many competing educational programs
and processes. In addition, the recent increase in student population has
been accompanied by an ever-increasing flow of both public and private funds
into the college system, resulting in an increasing need to evaluate the
potential payoff of differential funding policies.

Many of the differences among colleges with respect to their resources
have been documented by Astin and Holland (1962), Cartter (1964), and the
College Data Bank of Columbia's Bureau of Applied Social Research (1966).
However, little additional light has been shed on whether or not these
differences produce different effects on students. Certainly any study of
the impact of various colleges on students must take into account differences
in students who choose to attend particular colleges. Failure to account
for student talent at the time of college entrance, for example, was a
criticism of the well-known studies of Knapp and Goodrich (1952) and of
Knapp and Greenbaum (1953), who attempted to identify highly productive
institutions by using as criteria the number of advanced graduate degrees
and other scholarly rewards attained by a given institution's graduates.

Using scores on the National Merit Scholarship Qualifying Test as a
control for academic ability prior to college and a sample of National Merit
Scholars, Nichols (1964) and Astin (1968) found little relationship between
institutional characteristics and student academic growth in college. Nichols
employed a sample of 396 students at 91 colleges and used the Graduate Record
Examination's (GRE) Aptitude Test as the criterion variable. On the basis
of 669 students at 38 colleges, Astin more recently concluded that
"traditional indices of institutional quality do not appear to contribute to
student achievement" (1968, p. 661). Several factors should be considered in
evaluating the conclusion reached by these two studies. First, the small
sample size and the restriction to National Merit Scholars only would appear
to be less than desirable for generalization. Second, because of the small
number of students from each institution, both studies used individual
students as the unit of analysis rather than institutional mean scores. Thus,
Astin's independent effects of colleges appear quite small since he presented
them as a percentage of the total individual variance after adjustment for
input rather than as the percentage of the between-school variance adjusted
for input. This use of the ratios of school effects to the total individual
variance may be misleading in that it tends to underestimate the school
effect. How great the extent of underestimation is, of course, is a function
of the proportion of total variance which is accounted for by the
between-school variance. Finally, the procedure used to estimate the school
effect provides relatively conservative estimates (Werts and Linn, 1968).
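
The underestimation argument can be made concrete with a little arithmetic.
The sketch below is illustrative only: the function name and the two
proportions are hypothetical values chosen for the example, not figures from
Astin (1968) or from this study. It assumes a school effect that accounts for
a given share of the between-school variance and re-expresses it as a share
of the total individual variance.

```python
def effect_as_share_of_total(share_of_between: float,
                             between_share_of_total: float) -> float:
    """Re-express a school effect as a share of total individual variance.

    share_of_between: fraction of the between-school variance attributed
        to the school effect (hypothetical value).
    between_share_of_total: fraction of the total individual variance that
        lies between schools (hypothetical value).
    """
    return share_of_between * between_share_of_total


# Hypothetical illustration: an effect explaining 10% of the between-school
# variance, when only 20% of the total variance lies between schools.
share = effect_as_share_of_total(0.10, 0.20)
print(f"{share:.2%} of total individual variance")
```

The smaller the between-school share of total variance, the smaller the same
effect looks when it is stated against the total individual variance.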

This study attempted to overcome some of the handicaps characterizing the
Nichols and Astin studies by (1) selecting a larger sample of colleges
characterized by a wider range of ability, (2) using the institution as the
sampling unit and thus partitioning the between-school variance rather than
the total individual variance, and (3) employing several different
methodological approaches.

The major purpose of this research was to attempt to find partial answers
to two basic questions:

(1) If the input with respect to student ability is held constant, will
    identifiable groups of colleges have graduates showing greater gain
    in achievement than others, and

(2) Contingent on demonstrating differential gains between colleges, what
    are the characteristics of the most and least effective schools?



Method

The input or control variables were the Verbal and Mathematical scores of
the Scholastic Aptitude Test (SAT) and the student's major field of study.
The SAT was required or recommended for admission by each institution in the
sample. The output performance variables were the Area Tests of the Graduate
Record Examination (GRE) Institutional Testing Program. Each of the tests,
i.e., Social Science, Humanities, and Natural Science, is minutes in length
and is intended to assess the student's grasp of basic concepts plus his
ability to apply them to the variety of types of material which are presented
for his interpretation (Lannholm, 1955). Thus the Area Tests are considered
achievement tests of institutional effectiveness in these principal areas of
learning. As an institutional measure, the tests are generally given to
seniors; colleges that did not give the examination to all available seniors
(or at least to all members of a designated group, such as liberal arts
majors) were not included in this study.

The college descriptive measures, taken from several sources, included:

(1) measures of "institutional resources," specifically a decile ranking
    of the number of books, books per student, income per student,
    faculty per student, and proportion of faculty with a doctorate;
    also full time (equivalent) undergraduate enrollment, per student
    expenditures, type of control, percentage of students graduating in
    four years, and the percentage of graduates continuing to graduate
    or professional schools;

(2) estimated freshman orientation measures (Astin, 1965), including
    intellectualism, estheticism, status, pragmatism, and masculinity;

(3) college orientation measures according to Astin (1965), including
    realistic, scientific, social, conventional, enterprising, and
    artistic;

(4) average faculty compensation, and compensation per student as reported
    in the AAUP Journal (1968).

Only the group of characteristics under (1) was used in the majority of the
analyses because groups (2), (3), and (4) were unavailable for a number of
colleges.

The sample included 95 colleges that administered the GRE Area Tests in
1967 or 1968. Most college descriptive measures in group (1) above were
available for 93 of these colleges. The 95 colleges also required or
recommended applicants to submit the SAT for entrance. From each of these
colleges, a random sample of approximately 100 seniors who had completed the
GRE Area Tests was selected. For colleges with fewer than 100 seniors, the
entire class was chosen. The ETS test files were then searched for the SAT
scores for these students, resulting in a final sample of 6855. This
represented 74% of the 9216 students selected in the GRE sampling. The
majority of SAT scores were found in either the 1963 or 1964 file years,
although some were found in 1962 and 1965. Searches were not conducted beyond
those years.

The last two variables in group (1) were taken from Cass and Birnbaum (1968).
The other "institutional resources" variables were compiled by Columbia's
Bureau of Applied Social Research (1966) and based on 1963-64 ACE and USOE
Institu...

The institutions in this study were largely private, only four being
state colleges or universities. In general, student enrollment figures
were modest; only ten had more than 2000 undergraduates, none of which
approached the large multiversity enrollments typified by some state and city
universities. In addition to the public sector, the elite private colleges
of the Northeast were also under-represented. Approximately half of the sample
was at least loosely denominational, with this group divided about equally
between Catholic and Protestant denominations. In sum, the sample, while not
representative of all American higher education, at least included the many
types of small liberal arts institutions.

A computer-based procedure developed by Rock, Barone, and Linn (1968) was
then used to form taxonomic groupings of colleges according to their relative
profile similarity with respect to the descriptive characteristics. This
system used an iterative procedure in an attempt to maximize two objective
functions, one of which (the predictive objective function) is associated
with the input-output matrix and the second, called the grouping objective
function, yields an indication of the similarity of profiles among colleges
within any one group or groups formed on the college descriptive variables
or some subset of these descriptive variables. The predictive objective
function in this case attempted to maximize the between-group variance of
the residuals (i.e., the mean predicted output subtracted from the mean
observed output within each of the homogeneous groups of colleges). That
is, the computer procedure provided a means for searching for that subset
of descriptive characteristics from the total set which yields groups which
maximize the above predictive objective function. The direction and size
of these mean residuals indicated the relative gain or loss in achievement
for any one cluster of colleges when the input was held constant.
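
The residual logic described above can be sketched in a few lines. This is
not the Rock, Barone, and Linn procedure, which searches iteratively over
subsets of characteristics and groupings; it only shows, for one fixed
grouping, how mean residuals per group might be computed. The function names,
the grouping, and all numbers are hypothetical and are not data from the
study.

```python
import numpy as np


def college_residuals(inputs: np.ndarray, output: np.ndarray) -> np.ndarray:
    """Observed output mean minus the output mean predicted from inputs."""
    X = np.column_stack([np.ones(len(output)), inputs])  # add an intercept
    coef, *_ = np.linalg.lstsq(X, output, rcond=None)    # least-squares fit
    return output - X @ coef


def group_mean_residuals(residuals: np.ndarray, groups: np.ndarray) -> dict:
    """Average the college residuals within each group of colleges."""
    return {int(g): float(residuals[groups == g].mean())
            for g in np.unique(groups)}


# Hypothetical college means (SAT-V, SAT-M) and a GRE Area Test mean.
sat = np.array([[450., 470.], [500., 510.], [550., 540.],
                [600., 620.], [480., 500.], [530., 560.]])
gre = np.array([420., 480., 520., 600., 470., 500.])
groups = np.array([1, 1, 2, 2, 3, 3])  # hypothetical taxonomic groups

res = college_residuals(sat, gre)
means = group_mean_residuals(res, groups)
for g, m in means.items():
    print(f"group {g}: mean residual {m:+.2f}")
```

A positive mean residual marks a group whose output exceeds what its input
predicts; a negative one marks a group that falls short.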

In addition to the above analyses, colleges with large positive or
large negative deviations from the regression surface were compared for
systematic differences on such characteristics as type of control, location,
and religious affiliation.



Results 



In Table 1 the means, standard deviations, and intercorrelations among
SAT, GRE Area Tests, and major area are reported for the sample of 6855
seniors. For each of the GRE Area Tests either SAT-V or SAT-M correlated
at least .oh or higher. These correlations are somewhat higher than the
correlations between the National Merit Scholarship Qualifying Test and
the GRE Area Tests reported by Astin (1968). It should also be noted that
there was a positive correlation between major field of study and the
appropriate GRE Area Test, suggesting that major field should be taken into
account when the output scores are adjusted for input.

All of the subsequent analyses used the college as the sampling unit, and
thus it is the between-college variance that was analyzed rather than the
total variance. It would seem that the analysis of between-college variance
is more relevant than the analysis of total individual variance, since the
primary concern is the identification of college characteristics which
distinguish between colleges with high and low output with input controlled.
Any analysis of the total individual variance makes the implicit assumption
that the college effect can be measured within college. It is also assumed
that something is known about the extent and direction of the college effect
on the heterogeneity of the within-college variance. It could be argued that
the effect of college would reduce the heterogeneity of the within-college
variance. However, in the absence of empirical evidence, just the opposite
might also be argued. Given this state of uncertainty it seems preferable to
use only the between-college variance.
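
The distinction between total and between-college variance rests on the
standard partition of the total sum of squares into a between-college part
and a within-college part. The sketch below is a generic illustration of that
partition; the function name and the six scores are hypothetical and are not
taken from the study.

```python
import numpy as np


def partition_sum_of_squares(scores: np.ndarray, college: np.ndarray):
    """Split the total sum of squares into between- and within-college parts."""
    grand_mean = scores.mean()
    between = 0.0
    within = 0.0
    for c in np.unique(college):
        s = scores[college == c]
        between += len(s) * (s.mean() - grand_mean) ** 2  # college mean vs. grand mean
        within += ((s - s.mean()) ** 2).sum()             # students vs. own college mean
    return between, within


# Hypothetical scores for three students at each of two colleges.
scores = np.array([500., 520., 540., 600., 610., 620.])
college = np.array([0, 0, 0, 1, 1, 1])

between, within = partition_sum_of_squares(scores, college)
total = ((scores - scores.mean()) ** 2).sum()
print(between, within, total)
```

An analysis of college means, as in this study, works only with the first of
the two parts.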

In Table 2 the intercorrelations among the college means based on
students with SAT scores at the 93 colleges with descriptive data are
presented. As can be seen, the correlations between input (SAT mean) and
output (GRE mean) are quite high. SAT-V means correlate .89 with
GRE-Humanities means, and SAT-M means correlate .91, .92, and .93 with
GRE-Social Science, -Natural Science, and Total respectively. Substantial
correlations were also obtained for percentage of students majoring in Social
Sciences and GRE-Social Science means (r = .3^), percentage of students
majoring in Humanities and GRE-Humanities means (r = .^l), and percentage of
students majoring in Natural Sciences and GRE-Natural Science means (r = .37).

The correlations between the primary college descriptive characteristics
and GRE Area Test means and SAT means are reported in Table 3. Income per
student and proportion of faculty with doctorates had consistently high
correlations for all three Area Tests and for the GRE total. The faculty
compensation variables were highly correlated with the GRE and SAT means;
however, these data were available for a limited number of colleges.

Since residual scores were to be used for many of the analyses, an
attempt was made to estimate the stability of the residuals. The sample of
students within each college was randomly divided into two subsamples, and
GRE and SAT means were computed for each subsample. The correlations between
the means for one subsample and their counterparts in the second subsample
are reported in Table 4. These correlations ranged from a low of .95 for
GRE-Social Science and SAT-M to a high of .97 for GRE-Natural Science and
GRE-Total, indicating a high degree of stability for the college means.

Of greater relevance are the correlations among the GRE residuals for
subsample 1 with the corresponding residuals for subsample 2 when one of the
SAT scores was used as a predictor. These correlations between the residuals
are reported in Table 5. The least stable residual was the GRE-Social Science
adjusted for SAT-M (r = .62) and the most stable residual was the
GRE-Humanities adjusted for SAT-V (r = .90). In general, the residuals showed
considerable stability, certainly sufficient to justify relating college
characteristics to the residuals.
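
The split-half check on residual stability can be sketched as follows. This
is a minimal illustration under stated assumptions, not the study's actual
computation: the data are simulated, the function names are hypothetical, and
a single SAT predictor is used for each set of half-sample means.

```python
import numpy as np


def residualize(y: np.ndarray, x: np.ndarray) -> np.ndarray:
    """Residuals of y after a simple linear regression on x (with intercept)."""
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)


def split_half_residual_r(sat, gre, college, rng) -> float:
    """Correlate college residuals computed from two random half-samples."""
    half_means = {0: [], 1: []}
    for c in np.unique(college):
        idx = rng.permutation(np.flatnonzero(college == c))  # shuffle students
        mid = len(idx) // 2
        for h, part in enumerate((idx[:mid], idx[mid:])):
            half_means[h].append((sat[part].mean(), gre[part].mean()))
    residuals = []
    for h in (0, 1):
        m = np.array(half_means[h])
        residuals.append(residualize(m[:, 1], m[:, 0]))  # GRE mean given SAT mean
    return float(np.corrcoef(residuals[0], residuals[1])[0, 1])


# Simulated data (hypothetical): 30 colleges with 100 seniors each.
rng = np.random.default_rng(0)
n_colleges, n_students = 30, 100
college = np.repeat(np.arange(n_colleges), n_students)
ability = rng.normal(500, 60, n_colleges)[college]
effect = rng.normal(0, 15, n_colleges)[college]  # assumed college effect
sat = ability + rng.normal(0, 80, college.size)
gre = 0.9 * sat + effect + rng.normal(0, 60, college.size)

r = split_half_residual_r(sat, gre, college, rng)
print(f"split-half correlation of residuals: {r:.2f}")
```

A high correlation between the two sets of residuals suggests that the
residuals reflect something stable about the colleges rather than sampling
noise among students.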

The multiple correlations of SAT means and proportion in major field
with each of the GRE Area Test means are reported in Table 6. The multiple
correlations ranged from .92 for Natural Sciences to .95 for GRE Total. The
squared multiple correlations indicate the proportion of the between-college
output variance that can be predicted from SAT means and proportion in major.
These squared multiple correlations ranged from .85 to .91, and thus
approximately 9 to 15 percent of the between-college output variance could
not be predicted from the input measures.
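
The step from a multiple correlation to the unpredicted share of variance is
simple arithmetic: square the correlation and subtract from one. The short
sketch below applies this to the two rounded endpoints quoted above; the
function name is hypothetical. Because the inputs are rounded, the results
need not match the reported .85 to .91 range exactly.

```python
def unexplained_share(multiple_r: float) -> float:
    """Share of variance not predicted, given a multiple correlation R."""
    return 1.0 - multiple_r ** 2


# Rounded endpoints of the multiple correlations quoted in the text.
for r in (0.92, 0.95):
    print(f"R = {r:.2f}  R^2 = {r ** 2:.4f}  "
          f"unexplained = {unexplained_share(r):.4f}")
```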

Using the computer-based moderated regression procedure which was described
above, a subset of the college characteristics was selected which maximized
the objective function having to do with the between-group variance of
residuals. Table 7 presents the means of the selected college characteristics
and associated mean residuals for each group of colleges on each of the GRE
measures. Group 1 included 54 colleges characterized by relatively high
income per student and a large proportion of faculty with doctorates. This
group had positive mean residuals on all three Area Tests and the total.
Group 2, which was composed of 29 colleges, had relatively low income per
student and a relatively small proportion of faculty with doctorates. The
mean residuals for group 2 were the largest negative residuals in all three
areas and the total. Group 3, with an N of 10 colleges, was characterized by
relatively low income per student and a large proportion of faculty with
doctorates. Group 3 had the largest positive mean residual for Social Science
but negative residuals for Humanities, Natural Science, and Total.

Inspection of Table 7 suggests that income per student differentiated
group 1 from groups 2 and 3, while proportion of faculty with doctorate
differentiated groups 1 and 3 from group 2. This combination of income per
student and proportion of faculty with doctorate corresponds to an apparent
interaction that was observed for GRE Social Science. That is, for GRE Social
Science, colleges with low income per student can be distinguished by what
the income was spent on. In short, it appears that low income colleges that
spent money on obtaining a high proportion of faculty with doctorates did
better in Social Science than those that spent their money elsewhere. In
Humanities, Natural Science, and Total, however, income per student appears
to be the overriding consideration.

Per student expenditures were also investigated, but unlike income per
student, they did not discriminate between the more effective and less
effective schools. The per student expenditure information was obtained from
colleges on an Office of Education form 2000 and consisted of a weighted
composite of the following items: 1) general administration and general
expense, 2) instruction and departmental research, 3) libraries, and 4) the
operation and maintenance of the physical plant. Assuming these to be
accurately and uniformly reported by each college, one possible reason for
its ineffectiveness is that only one of the four specific expenditures
(instruction

.I4I for GRE-Natural Science. With the exception of GRE-Social Science, the
multiple partial correlations are statistically significant (p < .05). The
variables with the largest weights for all four criteria were F.T.E. and I/S
x F.T.E.
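
A multiple partial correlation of the kind reported here can be sketched
generically. The squared multiple partial correlation of an output with a set
of college characteristics, holding the inputs constant, equals the gain in
R-squared from adding the characteristics divided by the variance left
unexplained by the inputs alone. The code below is an illustration under
stated assumptions, not the study's computation: the data are simulated, the
function and variable names are hypothetical, and the product term standing
in for an interaction such as I/S x F.T.E. is an assumption.

```python
import numpy as np


def r_squared(X: np.ndarray, y: np.ndarray) -> float:
    """R-squared of a least-squares regression of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)


def multiple_partial_r(inputs, characteristics, y) -> float:
    """Multiple partial correlation of y with characteristics, given inputs."""
    r2_reduced = r_squared(inputs, y)
    r2_full = r_squared(np.column_stack([inputs, characteristics]), y)
    gain = max(r2_full - r2_reduced, 0.0)  # guard against rounding error
    return float(np.sqrt(gain / (1.0 - r2_reduced)))


# Simulated college-level data (hypothetical values, 93 colleges).
rng = np.random.default_rng(1)
n = 93
inputs = rng.normal(size=(n, 2))   # stand-ins for SAT-V and SAT-M means
income = rng.normal(size=n)        # stand-in for income per student
fte = rng.normal(size=n)           # stand-in for full time equivalent
chars = np.column_stack([income, fte, income * fte])  # assumed product term
y = inputs @ np.array([0.6, 0.7]) + 0.2 * fte + rng.normal(scale=0.3, size=n)

r = multiple_partial_r(inputs, chars, y)
print(f"multiple partial correlation: {r:.2f}")
```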

Astin's (1965) freshman orientation and college orientation measures were
also investigated. While some of these measures (particularly Selectivity)
have high zero-order correlations with GRE Area Test means, they were not
found to be very useful in predicting the residual output measures, largely
because of their high correlations with the input measures. These variables,
like the faculty compensation variables, were investigated to only a limited
extent in the present study because they were unavailable for a number of
colleges in the study. Other variables which were considered but did not aid
in the prediction of the residuals were location, type of control, religious
affiliation, and co-educational versus male or female institutions.

In viewing the results of the present study, several limitations should be
considered. Since the sample was limited to colleges requiring both the GRE
Area Tests and the SAT, it cannot be construed as being representative of the
total population of colleges. In particular, certain variables such as size,
type of control, and geographic location were restricted by the availability
of data. As noted earlier, there were relatively few large universities,
state-supported institutions, or engineering colleges.

An even more serious restriction is the narrow nature of the criterion
used as a measure of quality. Certainly there are many other outputs which
should be evaluated in addition to achievement as measured by the GRE Area
Tests. But though the Area Tests measure only a narrow aspect of quality,
the fact that these colleges choose to use the GRE Area Tests suggests that
they are relevant to the general educational goals of these institutions.

In addition to effects that a college may have on mean student
achievement, colleges might be differentially effective with different types
of students. For example, two colleges might have equal mean residuals, yet
one college might achieve this with small gains for below average students
and large gains for above average students, whereas the other college might
achieve this with just the opposite pattern. Such within-college effects
are beyond the scope of this study but are being pursued in further research.

Another limitation of this study is due to the limited nature of the
college measures that were investigated. More refined measures of income,
expenditures, and faculty characteristics would seem to be desirable.
Variables more directly concerned with the extent and nature of
student-faculty interactions would also seem to be particularly relevant.

Conclusions

In this study, 85 to 91 percent of the between-college variance was
predictable from student input. A small but significant proportion of the
9 to 15 percent remaining between-college variance was predictable from income
per student, the proportion of faculty with a doctorate, full time equivalent,
and the interaction of these three variables for all but GRE-Social Science.
The extent of these effects was larger for the GRE-Natural Science,
Humanities, and Total than for GRE-Social Science.

While the present study analyzed the between-college variance rather than
the total individual variance and used methodology (multiple partial
correlation) which is more sensitive to the possibility of isolating college
effects when there is a high correlation between such effects and inputs, the
results were not overly encouraging. Although the college effects appear
somewhat larger than in previous studies of Nichols (1964) and Astin (1968),
the increments are of limited practical significance.

Table 1

Correlations of Individual SAT Scores,
Major Area and Individual GRE Area Test Scores

N = 6855

Table 2

Correlations of SAT Means,
Percentage in Major Areas and GRE Area Test Means

Table 3

Correlations between College Descriptive Characteristics
and GRE Area Test Means and SAT Means

Table 4

Correlations of Means for Subsample 1

Table 6

Multiple Correlations of Mean SAT-V, SAT-M and Proportion
in Major Field with GRE Area Test Means

(N = 95 Colleges)

Table 7

Group Means on Selected College Characteristics

Table 8

Multiple Partial Correlations and Standard
Regression Weights for Predicting
Residual Output Means from College